We consider a system of $m$ linear equations in $n$ variables $Ax = b$, where $A$ is a
given $m \times n$ matrix and $b$ is a given $m$-vector known to be equal to $Ax$ for some
unknown solution $x$ that is integer and $k$-sparse: $x \in \{0,1\}^n$ and exactly $k$ entries
of $x$ are 1. We give necessary and sufficient conditions for recovering the solution
$x$ exactly using an LP relaxation that minimizes the $\ell_1$ norm of $x$. When $A$ is drawn
from a distribution that has exchangeable columns, we show an interesting connection
between the recovery probability and a well-known problem in geometry,
namely the $k$-set problem. To the best of our knowledge, this connection appears
to be new in the compressive sensing literature. We empirically show that for large
$n$, if the elements of $A$ are drawn i.i.d. from the normal distribution, then the performance
of the recovery LP exhibits a phase transition, i.e., for each $k$ there exists a
value $\bar{m}$ of $m$ such that the recovery always succeeds if $m > \bar{m}$ and always fails
if $m < \bar{m}$. Using the empirical data we conjecture that $\bar{m} = nH(k/n)/2$, where
$H(x) = -x \log_2 x - (1 - x) \log_2 (1 - x)$ is the binary entropy function.

1 Introduction 

We consider the system of linear equations in the real vector variable $x$:

$$Ax = b \qquad (1)$$

where $A$ is a given real $m \times n$ matrix, $b$ is a given vector in $\mathbb{R}^m$ and $x \in \mathbb{R}^n$. Suppose it is known
that the system has an underlying solution that is binary, i.e., $\exists\, x \in \{0,1\}^n$ such that $b = Ax$. We
are interested in the conditions under which $x$ can be recovered exactly and efficiently.

The above problem occurs in smart grids where we wish to retrieve the underlying phase connectivity
from a time series of meter measurements [1]. Each customer household is connected to one of the
three phases of a low-voltage transformer that distributes power to households. Both households
and transformers have smart meters. Therefore, for a series of time intervals, it is known
how many watt-hours of power are sent out on each phase and how many are consumed by each
customer. However, the phase to which a customer is connected is not known. System (1) is the problem
formulation for a single phase based on the principle of conservation of power. The columns of $A$
hold the time series of meter measurements from customers, $b$ holds the corresponding time series
for a phase, and $x$ determines whether a customer is connected to that phase. Empirical data collected
from real smart meters shows that measurements have sufficient variability over time and customers,
implying that $A$ has full rank.

If $m = n$, the unique underlying binary solution to (1) is recovered as $x = A^{-1} b$. If $m < n$,
system (1) has infinitely many real solutions and may have multiple binary solutions. When $m = 1$, the
problem reduces to the Subset-Sum problem, which is NP-hard. For $m < n$, even if a binary solution
to (1) is given, checking whether it is the unique solution is also NP-hard [2].

To circumvent these difficulties, Mangasarian et al. [3] first transform (1) to its equivalent $Ay = d$
using $y = e - 2x$, where $d = Ae - 2b$ and $e$ is a column vector of all ones. Then they give necessary
and sufficient conditions for the uniqueness of an integer solution $y \in \{-1, 1\}^n$ to the following LP
relaxation that minimizes the $\ell_\infty$ norm of $y$:

$$\min\ \delta \quad \text{s.t.} \quad Ay = d,\ -\delta e \le y \le \delta e. \qquad (2)$$

A solution $y$ that is unique to LP (2) and is integer guarantees that $(e - y)/2$ exactly recovers $x$.
The paper [3] also computes the probability that a randomly generated problem instance of LP (2)
satisfies the uniqueness conditions. This gives a lower bound on the probability that (1) has a unique
binary solution. For large $n$, a transition behavior is observed: the probability of uniqueness is
almost 0 for $m/n < 1/2$ and almost 1 for $m/n > 1/2$.
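
As a quick sanity check of the change of variables above, the following minimal sketch builds a small random instance and confirms that $y = e - 2x$ satisfies $Ay = d$ and that $(e - y)/2$ maps back to $x$. Python with NumPy, the instance sizes, and the seed are assumptions of this illustration, not part of [3].

```python
import numpy as np

# Illustrative sizes and seed (assumptions, not taken from the paper).
rng = np.random.default_rng(1)
m, n = 4, 6

A = rng.standard_normal((m, n))
x = rng.integers(0, 2, size=n).astype(float)   # a binary solution in {0,1}^n
b = A @ x

e = np.ones(n)
y = e - 2 * x            # maps 0 -> +1 and 1 -> -1, so y is in {-1,1}^n
d = A @ e - 2 * b        # right-hand side of the transformed system

# The transformed system Ay = d holds, and x is recovered as (e - y)/2.
residual = np.max(np.abs(A @ y - d))
x_back = (e - y) / 2
print(residual, np.array_equal(x_back, x))
```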

In this work, we follow the approach of [3] and study the conditions under which a binary and $k$-sparse
$x$ with exactly $k$ non-zero entries is a unique solution to the following alternate LP relaxation
that minimizes the $\ell_1$ norm of $x$:

$$\min\ e^T x \quad \text{s.t.} \quad Ax = b,\ 0 \le x \le 1 \qquad (3)$$

so that (3) exactly recovers the solution $x$ of (1). As in [3], we wish to find necessary and sufficient
conditions for the uniqueness of a binary $k$-sparse solution and to compute the probability of the
conditions being satisfied on a random instance as a function of $n$, $m$, $k$.
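
To make LP (3) concrete, here is a minimal sketch that solves it for one random instance. It assumes SciPy's `linprog` with the HiGHS backend as the solver, which is a choice of this example and not necessarily what was used for the experiments reported later. The function name `recover_binary_sparse`, the sizes, and the seed are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def recover_binary_sparse(A, b):
    """Solve LP (3): min e^T x  s.t.  Ax = b, 0 <= x <= 1."""
    n = A.shape[1]
    res = linprog(c=np.ones(n), A_eq=A, b_eq=b,
                  bounds=[(0.0, 1.0)] * n, method="highs")
    if res.status != 0:
        raise RuntimeError(res.message)
    return res.x

# Illustrative instance (sizes and seed are assumptions of this sketch).
rng = np.random.default_rng(0)
m, n, k = 40, 50, 5
A = rng.standard_normal((m, n))
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = 1.0   # exactly k ones
b = A @ x_true

x_hat = recover_binary_sparse(A, b)
print("max abs error:", np.max(np.abs(x_hat - x_true)))
```

With $m$ this close to $n$ the binary solution is, with overwhelming probability, the only feasible point of the box-constrained system, so recovery is expected to succeed; the interesting regime is much smaller $m$.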

Donoho et al. [4] also consider recovery of sparse solutions to (1), albeit with a different notion of
sparsity. They consider three LP relaxations, of which two are relevant to our work:

$$\min\ 0^T x \quad \text{s.t.} \quad Ax = b,\ 0 \le x \le 1 \quad \text{and} \qquad (4)$$

$$\min\ e^T x \quad \text{s.t.} \quad Ax = b,\ x \ge 0. \qquad (5)$$

Donoho et al.'s definition of sparsity is closely tied to the polytope defined by the constraints of
the LP relaxation. A signal is considered $k$-sparse if it lies on a $k$-face of the constraint polytope.
However, in LP (3) a $k$-sparse binary signal does not lie on a $k$-face of its associated polytope.
Therefore their techniques of counting faces of polytopes to compute uniqueness probabilities give
us only partial results that are not tight.

For example, in the case of LP (4), $x$ is considered $k$-sparse if it has $n - k$ entries equal to either 0 or 1 and $k$
entries in $(0, 1)$. Any binary signal lies on a vertex of the constraint polytope $0 \le x \le 1$ of LP (4) and
has zero sparsity. Donoho et al. show that LP (4) requires $m/n = 1/2$ to recover a signal of zero
sparsity (see Figure 3 of [4], $Q = I$). This is not the strongest result possible because LP (3) can
recover certain binary signals with $m/n$ strictly less than $1/2$. In the case of LP (5), $x$ is considered
$k$-sparse if $n - k$ entries are 0 and $k$ entries are $> 0$. In this case, although their definition of sparsity
coincides with ours, the results for recovery are not tight. For $m/n = 1/2$, LP (5) recovers a binary
signal of sparsity at most $k = n/4$ (see Figure 3 of [4], $Q = T$) while LP (3) can even recover a
binary signal of sparsity $k = n/2$.

The rest of the paper is organized as follows. Section 2 gives necessary and sufficient conditions
for a binary and $k$-sparse signal to be the unique solution of LP (3). Section 3 links the probability
of uniqueness with the expected number of $k$-sets in a random set of points from $\mathbb{R}^m$. The latter
problem is still open. Section 4 presents experimental results and compares the performance of the LPs.

2 Conditions for a $k$-sparse $x \in \{0,1\}^n$ to be the unique solution of LP (3)

We use the necessary and sufficient conditions for a given solution to be the unique optimal solution
of an LP. These are derived in [5] and shown below for reference.

Theorem 1 ([5], Theorem 2(iii)). Let $x$ be a solution of the LP:

$$\min\ c^T x \quad \text{s.t.} \quad Gx = h,\ Px \ge q. \qquad (6)$$

Let $P_{eq}$ denote the submatrix of $P$ consisting of rows of $Px \ge q$ which are tight at $x$, i.e., all rows $i$
such that $p_i x = q_i$. Then $x$ is unique if and only if there exists no $z$ satisfying

$$Gz = 0,\quad P_{eq} z \ge 0,\quad c^T z \le 0,\quad z \ne 0. \qquad (7)$$

Using the results of Theorem 1 we establish our main theoretical result:

Theorem 2. Let $x \in \{0, 1\}^n$ be a $k$-sparse solution of (1), $J_0 = \{j : x_j = 0\}$ and $J_1 = [n] \setminus J_0$.
Consider the points in $\mathbb{R}^m$ corresponding to the columns of $A$. Let the points corresponding to
$Q_0 = \{A_j : j \in J_0\}$ be colored red and the rest of the points $Q_1 = \{A_j : j \in J_1\}$ be colored green.
Then $x$ is the unique solution to LP (3) if and only if there exists a hyperplane in $\mathbb{R}^m$ not passing
through the origin that strictly separates the red points $Q_0$ from the green points $Q_1$.
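Proof. We rewrite LP (3) in the form of (6):

$$\min\ e^T x \quad \text{s.t.} \quad Ax = b,\ x \ge 0,\ -x \ge -e.$$

The inequalities that are tight at $x$ are $x_j \ge 0$ for all $j \in J_0$ and $-x_j \ge -1$ for all $j \in J_1$. Thus by
Theorem 1, $x$ is a unique solution if and only if there is no solution to the following system:

$$Az = 0,\quad z_j \ge 0\ \ \forall j \in J_0,\quad -z_j \ge 0\ \ \forall j \in J_1,\quad e^T z \le 0,\quad z \ne 0,$$

which is equivalent to the following system using the substitution $y_j = z_j$ if $j \in J_0$, $y_j = -z_j$ otherwise:

$$\sum_{j \in J_0} y_j A_j = \sum_{j \in J_1} y_j A_j,\quad \sum_{j \in J_0} y_j \le \sum_{j \in J_1} y_j,\quad y \ge 0,\quad y \ne 0. \qquad (8)$$

Introduce a new variable $y_0 \ge 0$ in $J_0$ with the column $A_0 = 0$. Then system (8) is equivalent to:

$$\sum_{j \in J_0} y_j A_j = \sum_{j \in J_1} y_j A_j,\quad \sum_{j \in J_0} y_j = \sum_{j \in J_1} y_j,\quad y \ge 0,\quad y \ne 0,$$

which is equivalent to the following system using the substitution $\alpha_j = y_j / \sum_{i \in J_1} y_i$ for $j \in J_0 \cup J_1$
(the denominator is positive because $y \ne 0$ and the two sums of the $y_j$ are equal):

$$\sum_{j \in J_0} \alpha_j A_j = \sum_{j \in J_1} \alpha_j A_j,\quad \sum_{j \in J_0} \alpha_j = \sum_{j \in J_1} \alpha_j = 1,\quad \alpha \ge 0.$$

Thus $x$ is the unique optimal solution of LP (3) if and only if there exist no common points between
the two convex sets $\mathrm{conv}(Q_0 \cup \{0\})$ and $\mathrm{conv}(Q_1)$, i.e., there exists a hyperplane that strictly
separates $Q_0 \cup \{0\}$ from $Q_1$; such a hyperplane does not pass through the origin and strictly
separates $Q_0$ from $Q_1$. □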
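
The geometric condition at the end of the proof can itself be tested with a small LP. The sketch below is one possible way to do so, not a procedure from the paper: it looks for a hyperplane $\{z : w^T z = 1\}$ that keeps the red points and the origin on the side $w^T z < 1$ and the green points on the side $w^T z > 1$ by maximizing a margin $t$. SciPy's `linprog` and the names `strictly_separable`, `w`, and `t` are assumptions of this illustration.

```python
import numpy as np
from scipy.optimize import linprog

def strictly_separable(A, x, tol=1e-9):
    """True if some w has w.A_j < 1 for red (x_j = 0) and w.A_j > 1 for green (x_j = 1).

    Variables are (w, t). We maximize t subject to
        w.A_j + t <= 1   for red columns,
       -w.A_j + t <= -1  for green columns,
        t <= 1           (keeps the origin on the red side and bounds the LP).
    Strict separation of conv(Q0 U {0}) from conv(Q1) holds iff the optimum t > 0.
    """
    A = np.asarray(A, dtype=float)
    x = np.asarray(x)
    m = A.shape[0]
    red = A[:, x == 0].T      # one red point per row
    green = A[:, x == 1].T    # one green point per row
    A_ub = np.vstack([
        np.hstack([red, np.ones((red.shape[0], 1))]),
        np.hstack([-green, np.ones((green.shape[0], 1))]),
    ])
    b_ub = np.concatenate([np.ones(red.shape[0]), -np.ones(green.shape[0])])
    c = np.zeros(m + 1)
    c[-1] = -1.0              # minimize -t, i.e. maximize t
    bounds = [(None, None)] * m + [(None, 1.0)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return bool(res.status == 0 and -res.fun > tol)

# Green points (2,2), (3,1); red points (0.5,0.5), (-1,0): w = (0.5, 0.5) separates them.
A_sep = np.array([[2.0, 3.0, 0.5, -1.0],
                  [2.0, 1.0, 0.5, 0.0]])
print(strictly_separable(A_sep, np.array([1, 1, 0, 0])))

# Green point (1,0) lies on the segment from the origin to the red point (2,0).
A_bad = np.array([[2.0, 1.0],
                  [0.0, 0.0]])
print(strictly_separable(A_bad, np.array([0, 1])))
```

The second instance shows why the origin matters: the line $z_1 = 1.5$ separates the red point from the green point, yet $x = (0, 1)$ is not optimal for LP (3) because $(0.5, 0)$ is feasible with a smaller objective.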

3 Probability of uniqueness for LP (3)

We obtain an initial expression for $P_{m,n,k}$, the probability that a hyperplane separates the two point
sets $Q_0 \cup \{0\}$ and $Q_1$ given by an arbitrary but fixed $k$-sparse $x \in \{0, 1\}^n$ for a random $A \in \mathbb{R}^{m \times n}$.
A $k$-set of a finite point set $S$ in Euclidean space is a subset of $k$ elements of $S$ that can be strictly
separated from the remaining points by a hyperplane.

Theorem 3. Suppose $A$ is drawn from a distribution that has exchangeable columns, i.e., every
permutation $\pi$ applied to the columns of $A$ leaves the distribution of $A$ unchanged. If $X$ is the
random variable denoting the number of $k$-sets of the points in $\mathbb{R}^m$ corresponding to the columns of
$A$, then $P_{m,n,k} = E[X] / \binom{n}{k}$.

Proof. Let $t = \binom{n}{k}$, let $S_1, S_2, \ldots, S_t$ be all possible subsets of columns of $A$ of size $k$, and let $X_i$ be the
indicator random variable of the event that $S_i$ is a $k$-set. Then $X = \sum_{i=1}^{t} X_i$ and $E[X] = \sum_{i=1}^{t} E[X_i] =
\sum_{i=1}^{t} P[X_i = 1]$. Since the distribution of the columns of $A$ remains unchanged after any permutation,
any property that is satisfied on a subset of size $k$ of the columns is
equally probable to be satisfied on another $k$-subset. Hence any subset of size $k$ is equally likely to
be a $k$-set and hence has probability $P_{m,n,k}$. Thus $E[X] = \sum_{i=1}^{t} P_{m,n,k} = t P_{m,n,k}$, implying that
$P_{m,n,k} = E[X] / \binom{n}{k}$. □

Finding the number of $k$-sets of an arbitrary set of points is a long-standing open problem [6]. The
paper [7] is the only known work that gives an upper bound on the expected number of $k$-sets of a
random set of points. However, we seek a good lower bound.

4 Experimental results 

We conduct Monte Carlo simulations and test LPs (2), (3) and (5) for recovery. A single simulation
experiment with parameters $F$, $n$, $m$, $k$, and $D$ consists of generating a random $m \times n$ matrix $A$
with entries from distribution $D$, generating two random solutions $x \in \{0, 1\}^n$, $y \in \{-1, 1\}^n$ each
having exactly $k$ 1's, computing $b = Ax$, $d = Ay$ and solving the LP $F$. We record a success
if the solution of $F$ matches the solution we started with, with a relative error of $10^{-9}$. For each
combination of $n$, $m$, $k$, $D$ we consider 200 random instances of $(A, x, y)$ and record the success
rate as the fraction $R_F$ of instances for which LP $F$ succeeds. We repeat the experiments by varying
$n$, $m$, $k$, $D$. We consider the following distributions for $D$. $D_1$: each entry of $A$ comes from the standard
normal distribution $\mathcal{N}(\mu = 0, \sigma^2 = 1)$, $D_2$: $\mathcal{N}(100, 1)$, $D_3$: uniform distribution $U(0, 100)$, and
$D_4$: each column $A_j$ of $A$ comes from $\mathcal{N}(\mu_j, 1)$ where $\mu_j \sim U(0, 100)$. We experiment with
$n \in \{200, 500, 1600\}$ and for each $n$ we vary $m$ and $k$.
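
The experimental protocol for LP (3) under distribution $D_1$ can be sketched as follows. This is a scaled-down illustration under stated assumptions rather than the code behind the reported results: it uses SciPy's `linprog` (HiGHS), a small $n$, 20 trials instead of 200, and an absolute tolerance of $10^{-6}$ instead of the relative $10^{-9}$ quoted above, chosen to sit above the solver's default feasibility tolerance. The name `success_rate` is hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def success_rate(n, m, k, trials=20, tol=1e-6, seed=0):
    """Fraction of random instances on which LP (3) returns the planted x."""
    rng = np.random.default_rng(seed)
    wins = 0
    for _ in range(trials):
        A = rng.standard_normal((m, n))                  # distribution D_1
        x = np.zeros(n)
        x[rng.choice(n, size=k, replace=False)] = 1.0    # exactly k ones
        b = A @ x
        res = linprog(np.ones(n), A_eq=A, b_eq=b,
                      bounds=[(0.0, 1.0)] * n, method="highs")
        if res.status == 0 and np.max(np.abs(res.x - x)) <= tol:
            wins += 1
    return wins / trials

# Sweep m for a fixed (n, k) to see the success rate rise with m.
n, k = 40, 8
for m in (4, 12, 20, 28, 36):
    print(m, success_rate(n, m, k))
```

At such a small $n$ the transition is wide; the narrow transition zone described below is only expected for much larger $n$.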

For the first set of experiments, following [4], we vary $m$ from $n/10$ to $9n/10$ in 17 equal steps and
for each $n$, $m$ we vary $k$ from 1 to $m$ in $m/4$ equal steps. We consider the experimental data in a
3D space where the undersampling factor $\delta = m/n$ is on the X-axis, the relative sparsity factor
$\rho = k/m$ is on the Y-axis, and the success rates of the LPs are on the Z-axis. Fig. 1(a) shows the level sets for
three different success rates 0.1, 0.5, 0.9 for $n = 500$ using distribution $D_1$. The lower three curves
correspond to LP (5) and the upper three to LP (3). For each LP, the three level sets represent a
narrow transition zone below which the success rate is almost 1 and above which the success rate is
almost 0. The transition zone becomes thinner as $n$ increases. We see that LP (3) outperforms LP (5).

For the second set of experiments, following [3], we vary $k$ from $n/10$ to $9n/10$ in 17 equal steps
and vary $m$ from $0.02n$ to $0.98n$ in 25 steps. Here again, we consider the experimental data in a 3D
space where the absolute sparsity factor $\eta = k/n$ is on the X-axis, $\delta$ is on the Y-axis and the success rates
of the LPs are on the Z-axis. Fig. 1(b) shows the level sets for success rate 0.5 for $n = 500$ of LP (2) using
distributions $D_1$, $D_2$ and of LP (3) using $D_1$. Results for other distributions are similar and are not
shown here. We call the value of $\delta$ for which the success rate crosses 0.5 the transition point. If we
consider the transition points as a function of $\eta$, then the level sets for LP (3) suggest the following
conjecture:

Conjecture 4. If the entries of $A$ are i.i.d. from some absolutely continuous distribution, then for
large $n$, LP (3) exactly recovers a binary $k$-sparse solution to (1) using $nH(k/n)/2$ measurements.
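
For a feel of the numbers, the short sketch below evaluates the conjectured measurement count $nH(k/n)/2$ next to the quantity $k \log_2(n/k)$ for a few values of $k$. The function names are hypothetical, and the chosen $n$ and $k$ values are illustrative.

```python
import math

def binary_entropy(p):
    """H(p) = -p log2 p - (1-p) log2 (1-p), with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def conjectured_measurements(n, k):
    """Conjectured transition point n * H(k/n) / 2 for LP (3)."""
    return n * binary_entropy(k / n) / 2

def cs_reference(n, k):
    """The reference quantity k * log2(n / k)."""
    return k * math.log2(n / k)

n = 500
for k in (50, 125, 250):
    print(k, round(conjectured_measurements(n, k), 1), round(cs_reference(n, k), 1))
```

At $k = n/2$ both expressions equal $n/2$, and for smaller $k/n$ the conjectured count is the smaller of the two.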
Fig. 1(b) also plots the function $k \log_2(n/k)$, which is a lower bound on the number of measurements
to recover general $k$-sparse signals given by compressed sensing. We see that for $\eta < 0.5$,
LP (3) requires fewer measurements for binary signals.

Figure 1: Simulation Results. (a) Transition behavior of LPs (3), (5) in $(\rho, \delta)$ space. (b) Transition behavior of LPs (2), (3) in $(\delta, \eta)$ space.

References 



[1] V. Arya, D. Seetharam, S. Kalyanaraman, K. Dontas, C. Pavlovski, S. Hoy, and J. R. Kalagnanam. Phase
identification in smart grids. In 2nd IEEE International Conference on Smart Grid Communications, 2011.

[2] C.H. Papadimitriou. On the complexity of unique solutions. Journal of the ACM (JACM), 31(2):392-400,
1984.

[3] O.L. Mangasarian and B. Recht. Probability of unique integer solution to a system of linear equations.
European Journal of Operational Research, 214(1):27-30, 2011.

[4] D.L. Donoho and J. Tanner. Precise undersampling theorems. Proceedings of the IEEE, 98(6):913-924,
2010.

[5] O.L. Mangasarian. Uniqueness of solution in linear programming. Linear Algebra and its Applications,
25:151-162, 1979.

[6] G. Toth. Point sets with many k-sets. Discrete and Computational Geometry, 26(2):187-194, 2001.

[7] K.L. Clarkson. On the expected number of k-sets of coordinate-wise independent points.
http://cm.bell-labs.com/who/clarkson/cwi_ksets/p.pdf






